Parton Density Functions (PDFs) and their uncertainties are extremely important topics for both
the Tevatron and the LHC. Experiments at the Tevatron can enhance this knowledge not only by constraining
the PDF fits, but also by developing and refining the available PDF tools through feedback
from the experiments that are currently analyzing the highest-energy hadron collider data available.
It is important that the community has standardized tools and methods at its disposal. In this note
we briefly summarize the most recent developments of the Les Houches Accord PDF (LHAPDF) interface,
which is the modern replacement for PDFLIB. We also outline and compare the methods of quantifying
the impact of PDF uncertainties on physical observables. The PDF weighting method for
propagating errors from PDFs to event generator observables is outlined in detail, and example code
for using this method with PYTHIA is also included.



The experimental errors at current and future hadron colliders are expected to decrease to a level that will
challenge the uncertainties in theoretical calculations. One important component of the theoretical uncertainty on
predictions at hadron colliders comes from the Parton Density Functions (PDFs) of the (anti)proton.

The highest energy particle colliders in the world currently, and in the near future, collide hadrons. To make 
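predictions of hadron collisions, the parton cross sections must be folded with the parton density functions:

$$\sigma = \sum_{i,j} \int dx_1 \, dx_2 \; f_{i/p}(x_1) \, f_{j/p(\bar{p})}(x_2) \; \hat{\sigma}(ij \to X) \tag{1}$$

$\hat{\sigma}$ - cross section for the partonic subprocess $ij \to X$,
$x_1, x_2$ - parton momentum fractions,
$f_{i/p(\bar{p})}(x_i)$ - probability to find a parton $i$ with momentum fraction $x_i$ in the (anti)proton.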

A long-standing problem when performing such calculations is to quantify the uncertainty of the results coming
from our limited knowledge of the PDFs. Even if the parton cross section $\hat{\sigma}$ is known very precisely, there may be a
sizable error on the hadronic cross section $\sigma$ due to the PDF uncertainty.
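
As a rough numerical illustration of how a PDF change feeds directly into the hadronic cross section, the sketch below folds a toy partonic cross section with toy parton densities. The functional forms, the parameter values and the function names (`toy_pdf`, `toy_parton_xsec`, `hadronic_xsec`) are assumptions made purely for illustration; they are not taken from any fitted PDF set or from LHAPDF.

```python
def toy_pdf(x, a=-0.5, b=3.0):
    """Toy parton density f(x) = x**a * (1 - x)**b (illustrative only)."""
    return x ** a * (1.0 - x) ** b


def toy_parton_xsec(x1, x2, s=1.0):
    """Toy partonic cross section falling as 1 / (x1 * x2 * s) (illustrative only)."""
    return 1.0 / (x1 * x2 * s)


def hadronic_xsec(pdf1, pdf2, sigma_hat, n=100, xmin=1e-3):
    """Midpoint-rule estimate of the double integral over x1 and x2."""
    h = (1.0 - xmin) / n
    total = 0.0
    for i in range(n):
        x1 = xmin + (i + 0.5) * h
        for j in range(n):
            x2 = xmin + (j + 0.5) * h
            total += pdf1(x1) * pdf2(x2) * sigma_hat(x1, x2)
    return total * h * h


# The partonic cross section is held fixed; only one PDF is scaled by 10%.
central = hadronic_xsec(toy_pdf, toy_pdf, toy_parton_xsec)
shifted = hadronic_xsec(lambda x: 1.1 * toy_pdf(x), toy_pdf, toy_parton_xsec)
relative_change = shifted / central - 1.0
```

In this toy setup a 10% rescaling of one density changes the folded result by the same 10%, even though the partonic cross section itself is untouched.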

The Tevatron can contribute to PDF knowledge in many ways that will benefit the experiments at the LHC. First,
measurements made by the experiments at FNAL will reduce PDF uncertainties by constraining PDF fits. Perhaps
more importantly, tools and techniques for propagating PDF uncertainty through to physical observables can be
improved and tested at the Tevatron.

Next-to-leading order (NLO) is the first order at which the normalization of the hard-scattering cross sections
has a reasonable uncertainty. Therefore, this is the first order at which PDF uncertainties are usually applied. To
date, all PDF uncertainties have been calculated in the context of NLO global analyses. However, useful information
can still be obtained by using NLO PDF uncertainties with leading order (LO) calculations and parton shower Monte
Carlos [1].

Techniques and tools for calculating PDF uncertainty in the context of LO parton shower Monte Carlos will be 
the primary topic of this document. Examples are provided employing CTEQ6 [2] error sets from LHAPDF and the 
parton shower Monte Carlo program PYTHIA [3]. 

II. LHAPDF UPDATE 
A. Recent developments 

Historically, the CERN PDFLIB library [4] has provided a widely used standard FORTRAN interface to PDFs,
with interpolation grids built into the PDFLIB code itself. However, it was realized that PDFLIB would be increasingly
unable to meet the needs of the new generation of PDFs, which often involve large numbers of sets ($\approx$ 20-40)
describing the uncertainties on the individual partons from variations in the fitted parameters. As a consequence,
at the Les Houches meeting in 2001 [5] the beginnings of a new interface were conceived: the so-called
Les Houches Accord PDF (LHAPDF). The LHAGLUE package [6], together with a unique PDF numbering scheme, enables
LHAPDF to be used in the same way as PDFLIB, without requiring any changes in the PYTHIA or HERWIG codes.
The evolution of LHAPDF (and LHAGLUE) up to summer 2005 is well documented [7].

Recently, LHAPDF has been further improved. With the release of v4.1 in August 2005, the installation method
was upgraded to the more conventional configure; make; make install. Version 4.2, released in
November 2005, includes the new cteq6AB (variable $\alpha_s(M_Z)$) PDF sets. It also includes modifications by
the CTEQ group to other cteq code to improve speed. This version also fixed some minor bugs that affected
the a02m_nnlo.LHgrid file (the previous one was erroneously the same as LO) and the SMRSPI code, which was wrongly
setting the u sea to zero.

A v5 version, which adds the option to store PDFs from multiple sets in memory, has been released.
This new functionality makes it possible to obtain PDF results from many sets while generating
an MC sample only once, without significant loss of speed.

B. Consistency checks 

As a technical check, cross sections have been computed, along with errors where appropriate, for all PDF sets
included in LHAPDF. For each member of a PDF set, 10,000 events are generated with both HERWIG [8] and PYTHIA [3],
at both Tevatron and LHC energies. As this study serves simply as a technical check of the interface, no attempt
was made to unfold the true PDF error. The maximum Monte Carlo variance (integration error) in our checks is
less than 1 percent. This has not been subtracted and will result in an overestimate of the true PDF uncertainty by a
factor $\lesssim 1.05$ in our analysis. The results in general show good agreement for most PDFs included in the checks.
Overall the consistency is better at Tevatron energies, where we do not have to make large extrapolations to the
new energy domain and much broader phase space covered by the LHC.

Two complementary processes are used:

• Drell-Yan pairs ($\mu^+\mu^-$): the Drell-Yan process is chosen here to probe the functionality of the quark PDFs
included in the LHAPDF package.

• Higgs production: the cross section for $gg \to H$ probes the gluon PDFs, so this channel is complementary to
the case considered above.

III. PDF UNCERTAINTIES 

As stated above, the need to understand and reduce PDF uncertainties in theoretical predictions for collider
physics is of paramount importance. One of the first signs of this necessity was the apparent surplus of high-$p_T$
events observed in the inclusive jet cross section by the CDF experiment at FNAL in Run I. Subsequent analysis of
the PDF uncertainty in this kinematic region indicated that the deviation was within the range of the PDF-dominated
theoretical uncertainty on the cross section. Indeed, when the full jet data from the Tevatron (including the
DØ measurement over the full rapidity range) were included in the global PDF analysis, the enhanced high-$x$ gluon
preferred by the CDF jet data from Run I became the central solution. This was an overwhelming sign that PDF
uncertainty needed to be quantified [9]. Below, a short review of one approach to quantifying these uncertainties,
the Hessian matrix method, is given, followed by outlines of two methods used to calculate the PDF uncertainty on
physical observables.

A. Review of the Hessian Method 

Experimental constraints must be incorporated into the uncertainties of parton distribution functions before these
uncertainties can be propagated through to predictions of observables. The Hessian method [10] both constructs
an $N$-eigenvector basis of PDFs and provides a method from which uncertainties on observables can be calculated.
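The first step of the Hessian method is to make a fit to data using $N$ free parameters. The global $\chi^2$ of this fit is
minimized, yielding a central or best-fit parameter set $S_0$. Next, the global $\chi^2$ is increased to form the Hessian error
matrix:

$$\Delta\chi^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} H_{ij} \, (a_i - a_i^0)(a_j - a_j^0) \tag{2}$$

where $a_i$ are the fit parameters and $a_i^0$ their best-fit values.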

This matrix can then be diagonalized, yielding $N$ (20 for CTEQ6) eigenvectors. Each eigenvector probes a direction
in PDF parameter space that is a combination of the 20 free parameters used in the global fit. The largest eigenvalues
correspond to the best determined directions and the smallest eigenvalues to the worst determined directions in
PDF parameter space. For the CTEQ6 error PDF set, there is a factor of roughly one million between the largest
and smallest eigenvalues. The eigenvectors are numbered from highest eigenvalue to lowest eigenvalue. Each of the $N$
eigenvector directions is then varied up and down within tolerance to obtain $2N$ new parameter sets, $S_i^{\pm}$ ($i = 1, \ldots, N$).
These parameter sets each correspond to a member of the PDF set, $F_i^{\pm} = F(x, Q; S_i^{\pm})$. The PDF library described
above, LHAPDF, provides standard access to these PDF sets.
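
The construction of the $2N$ displaced parameter sets can be sketched for a two-parameter toy fit. The example assumes the quadratic form $\Delta\chi^2 = \sum_{ij} H_{ij}\,\delta a_i\,\delta a_j$, a hypothetical $2 \times 2$ Hessian, and a tolerance $T$ defined by $\Delta\chi^2 = T^2$; the numbers and the function names are illustrative and do not come from any real global fit.

```python
import math


def eigen_sym_2x2(h11, h12, h22):
    """Eigenpairs of a symmetric 2x2 matrix, largest eigenvalue first."""
    mean = 0.5 * (h11 + h22)
    rad = math.hypot(0.5 * (h11 - h22), h12)
    theta = 0.5 * math.atan2(2.0 * h12, h11 - h22)  # angle of the principal axis
    v_large = (math.cos(theta), math.sin(theta))
    v_small = (-math.sin(theta), math.cos(theta))
    return [(mean + rad, v_large), (mean - rad, v_small)]


def eigenvector_sets(s0, hessian, tolerance):
    """Return [(S_k_plus, S_k_minus), ...], each displaced so that delta chi^2 = tolerance**2."""
    (h11, h12), (_, h22) = hessian
    sets = []
    for lam, v in eigen_sym_2x2(h11, h12, h22):
        step = tolerance / math.sqrt(lam)  # well determined direction -> small step
        plus = tuple(a + step * c for a, c in zip(s0, v))
        minus = tuple(a - step * c for a, c in zip(s0, v))
        sets.append((plus, minus))
    return sets


def delta_chi2(s, s0, hessian):
    """Quadratic form sum_ij H_ij * da_i * da_j for a displaced parameter set."""
    d = [a - b for a, b in zip(s, s0)]
    return sum(hessian[i][j] * d[i] * d[j] for i in range(2) for j in range(2))


# Hypothetical best-fit parameters, Hessian and tolerance (not from a real fit).
s0 = (0.5, 1.0)
hessian = [[4.0, 1.0], [1.0, 2.0]]
members = eigenvector_sets(s0, hessian, tolerance=10.0)
```

Each returned pair plays the role of $S_k^{+}$ and $S_k^{-}$: the direction with the larger eigenvalue is displaced by the smaller step, in line with the ordering from best to worst determined directions described above.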

B. 'Master' Equations



Although the variations applied in the eigenvector directions are symmetric by construction, this is not always the
case for the result of these variations when propagated through to an observable. In general, the well constrained
directions (low eigenvector numbers) tend to have symmetric positive and negative deviations on either side of
the central value of the observable ($X_0$). This cannot be counted on in the case of the smaller eigenvalues (larger
eigenvector numbers). The $2N+1$ members of the PDF set provide $2N+1$ results for any observable of interest. Two
methods for obtaining such a set of results are described in detail below. Once results are obtained, they can be used to
approximate the PDF uncertainty through the use of a 'Master Equation'. Although many versions of these equations
can be found in the literature, the type which considers maximal positive and negative variations of the physical
observable separately, known as the "modified tolerance method", is preferred:



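$$\Delta X^{+}_{\max} = \sqrt{\sum_{i=1}^{N} \left[\max\left(X_i^{+} - X_0,\; X_i^{-} - X_0,\; 0\right)\right]^2} \tag{3}$$

$$\Delta X^{-}_{\max} = \sqrt{\sum_{i=1}^{N} \left[\max\left(X_0 - X_i^{+},\; X_0 - X_i^{-},\; 0\right)\right]^2} \tag{4}$$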
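
A minimal sketch of Equations 3 and 4 in code may help make the bookkeeping concrete. The function name `modified_tolerance` and the observable values below are hypothetical; the lists are assumed to be ordered so that entry $i$ of each list belongs to the same eigenvector.

```python
import math


def modified_tolerance(x0, x_plus, x_minus):
    """Return (dX_plus, dX_minus) following Equations 3 and 4."""
    up = 0.0
    down = 0.0
    for xp, xm in zip(x_plus, x_minus):
        # Largest upward shift from this eigenvector pair (0 if both go down).
        up += max(xp - x0, xm - x0, 0.0) ** 2
        # Largest downward shift from this eigenvector pair (0 if both go up).
        down += max(x0 - xp, x0 - xm, 0.0) ** 2
    return math.sqrt(up), math.sqrt(down)


# Hypothetical observable values for N = 3 eigenvectors (not from the paper).
x0 = 100.0
x_plus = [103.0, 99.0, 101.0]
x_minus = [97.0, 98.0, 100.5]
dx_up, dx_down = modified_tolerance(x0, x_plus, x_minus)
```

In this toy input the second eigenvector moves the observable down for both variations, so it contributes only to the negative error, and the third moves it up for both, so it contributes only to the positive error.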

Other forms of 'Master' equations, with their flaws, are summarized below:

• $\Delta X_1 = \frac{1}{2} \sqrt{\sum_{i=1}^{N} \left(X_i^{+} - X_i^{-}\right)^2}$
This is the original CTEQ 'Master Formula'. It correctly predicts the uncertainty on the PDF values, since in the
PDF basis $X_i^{+}$ and $X_i^{-}$ are symmetric by construction. However, for physical observables this equation will
underestimate the uncertainty if $X_i^{+}$ and $X_i^{-}$ lie on the same side of $X_0$.



• $\Delta X_2 = \frac{1}{2} \sqrt{\sum_{i=1}^{2N} R_i^2}$ ($R_1 = X_1^{+} - X_0$, $R_2 = X_1^{-} - X_0$, $R_3 = X_2^{+} - X_0$, ...)

If $X_i^{+}$ and $X_i^{-}$ lie on the same side of $X_0$, this equation adds contributions from both in quadrature. NOTE:
For symmetric and asymmetric deviations, AXx varies from -> ^AXj 



• Positive and negative variations based on eigenvector directions:

$\Delta X^{+} = \sqrt{\sum_{i=1}^{N} \left(X_i^{+} - X_0\right)^2}$, $\Delta X^{-} = \sqrt{\sum_{i=1}^{N} \left(X_i^{-} - X_0\right)^2}$

Since the positive and negative directions defined in the PDF eigenvector space are not always related to positive
and negative variations of an observable, these equations cannot be interpreted as positive and negative
errors in the general case.
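
The difference between these forms shows up most clearly when both variations of an eigenvector move the observable in the same direction. The sketch below compares the original symmetric formula, the eigenvector-direction formula and the modified tolerance method on a single hypothetical eigenvector pair; all function names and numbers are illustrative assumptions.

```python
import math


def symmetric_master(x_plus, x_minus):
    """Original symmetric form: half the quadrature sum of (X+ - X-)."""
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(x_plus, x_minus)))


def directional(x0, x_plus, x_minus):
    """Quadrature sums taken separately over the '+' and '-' eigenvector members."""
    up = math.sqrt(sum((p - x0) ** 2 for p in x_plus))
    down = math.sqrt(sum((m - x0) ** 2 for m in x_minus))
    return up, down


def modified_tolerance(x0, x_plus, x_minus):
    """Maximal positive and negative shifts of the observable, summed in quadrature."""
    up = sum(max(p - x0, m - x0, 0.0) ** 2 for p, m in zip(x_plus, x_minus))
    down = sum(max(x0 - p, x0 - m, 0.0) ** 2 for p, m in zip(x_plus, x_minus))
    return math.sqrt(up), math.sqrt(down)


# One hypothetical eigenvector whose two variations both raise the observable.
x0, x_plus, x_minus = 100.0, [104.0], [102.0]
sym = symmetric_master(x_plus, x_minus)        # 1.0: smaller than either shift
dir_up, dir_down = directional(x0, x_plus, x_minus)  # 4.0 and 2.0
tol_up, tol_down = modified_tolerance(x0, x_plus, x_minus)  # 4.0 and 0.0
```

Here the symmetric form reports 1.0 although the observable shifts by up to 4.0, and the eigenvector-direction form reports a "negative" error of 2.0 for a shift that is actually upward. Only the modified tolerance result (4.0 up, 0.0 down) reflects what happens to the observable.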



C. Summary of Techniques 



Two main techniques are currently employed to study the effect of PDF uncertainties on physical observables.
Both techniques work with the PDF sets derived from the Hessian method.






1. 'Brute Force' 



The 'brute force' method simply entails running the MC and obtaining the observable of interest for each PDF in
the PDF set. This method is robust and theoretically correct. Unfortunately, it can require very large amounts of CPU time,
since large statistical samples must be generated for the PDF uncertainty to be distinguishable from statistical fluctuations.
The method is generally impractical when detector simulation is required.

Because the contributions of the PDF set members to the uncertainty are added in quadrature, the uncertainty is often
dominated by only a few members of the error set. In this case, a variation of the 'brute force' method can be
applied. Once the eigenvectors to which the observable is most sensitive are determined, MC samples need to be
generated only for the members corresponding to the variation of these eigenvectors. Since the neglected members could
only add to the quadrature sum, this method will always slightly underestimate the true uncertainty.
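
The reduced 'brute force' variant can be sketched as a simple selection step. The example assumes that a rough estimate of the squared shift of the observable is already available for every eigenvector (for instance from a small preliminary sample); the 95% threshold, the numbers and the function name `dominant_members` are arbitrary choices for illustration.

```python
import math


def dominant_members(contrib_sq, fraction=0.95):
    """Indices of the fewest eigenvectors whose squared shifts reach `fraction` of the total."""
    order = sorted(range(len(contrib_sq)), key=lambda i: contrib_sq[i], reverse=True)
    total = sum(contrib_sq)
    kept, running = [], 0.0
    for i in order:
        if running >= fraction * total:
            break
        kept.append(i)
        running += contrib_sq[i]
    return kept


# Hypothetical squared shifts of an observable for five eigenvectors.
contrib_sq = [9.0, 0.04, 4.0, 0.01, 0.25]
kept = dominant_members(contrib_sq)
full = math.sqrt(sum(contrib_sq))
truncated = math.sqrt(sum(contrib_sq[i] for i in kept))
```

With these numbers only two of the five eigenvectors need dedicated MC samples, and the truncated quadrature sum falls slightly below the full one, as expected.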



2. PDF Weights 

As mentioned above, it is often not possible to generate the desired MC sample many times in order to obtain the
uncertainty on the observable due to the PDF. The 'PDF Weights' method solves this problem [1]. The idea is that
the PDF contribution to Equation 1 may be factored out. That is, for each event generated with the central PDF from
the set, a PDF weight ($W_n^0 = 1$, $W_n^i = \frac{f(x_1, Q; S_i)\, f(x_2, Q; S_i)}{f(x_1, Q; S_0)\, f(x_2, Q; S_0)}$, where $n = 1, \ldots, N_{\mathrm{events}}$ and $i = 1, \ldots, N_{\mathrm{PDF}}$) can be stored.
The PDF weight technique can be summarized as follows:

• Only one MC sample is generated, but $2N$ (e.g. 40) PDF weights are obtained for each event:

$$W_n^0 = 1, \qquad W_n^i = \frac{f(x_1, Q; S_i)\, f(x_2, Q; S_i)}{f(x_1, Q; S_0)\, f(x_2, Q; S_0)} \tag{5}$$

where $n = 1, \ldots, N_{\mathrm{events}}$ and $i = 1, \ldots, N_{\mathrm{PDF}}$.

• Only one run is needed, so the kinematics do not change and there is no residual statistical variation in the uncertainty.

• The observable must be weighted on an event-by-event basis for each PDF of the set. One can either store an
ntuple of weights to be used 'offline', or fill a set of weighted histograms (one for each PDF in the set).
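
The steps above can be sketched with toy parton densities standing in for the real PDF members. Everything in this example is an assumption made for illustration: the functional form in `toy_pdf`, the parameter tuples that play the role of $S_0$ and $S_i$, and the three events. As in Equation 5, the flavour index is suppressed; a real application would evaluate the density of the actual incoming flavour through a PDF library.

```python
def toy_pdf(x, q, params):
    """Toy density f(x, Q; S); `params` = (a, b) stands in for one PDF member S.

    The toy form ignores Q; a real PDF would depend on it.
    """
    a, b = params
    return x ** a * (1.0 - x) ** b


def pdf_weight(x1, x2, q, member, central, pdf=toy_pdf):
    """Equation 5: PDF product for an error member over that for the central member."""
    num = pdf(x1, q, member) * pdf(x2, q, member)
    den = pdf(x1, q, central) * pdf(x2, q, central)
    return num / den


def weighted_totals(events, members, central):
    """Sum of weights per member; index 0 is the central PDF (weight 1 per event)."""
    totals = [0.0] * (len(members) + 1)
    for x1, x2, q in events:
        totals[0] += 1.0
        for i, member in enumerate(members, start=1):
            totals[i] += pdf_weight(x1, x2, q, member, central)
    return totals


# Hypothetical events (x1, x2, Q) generated once with the central member.
events = [(0.1, 0.2, 91.0), (0.05, 0.3, 91.0), (0.02, 0.01, 91.0)]
central = (-0.5, 3.0)
members = [(-0.52, 3.0), (-0.48, 3.0)]  # one "up" and one "down" variation
totals = weighted_totals(events, members, central)
```

The list `totals` is the simplest weighted "histogram" (a single bin counting events); filling one histogram per member with the same weights generalizes this to any observable, and the resulting $2N+1$ values can be passed to a master equation.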

The benefits of the weighting technique are twofold. First, only one sample of MC must be generated. Second,
since the observable for each PDF member is obtained from the same MC sample, there is no residual statistical
fluctuation in the estimate of the PDF uncertainty. One concern involving this method is that re-weighting events
does not correctly modify the Sudakov form factors. However, the effect of varying the PDF on the Sudakov form factors
was shown to be negligible [13]. That is, the initial state parton shower created with the central PDF (CTEQ6.1) also
accurately represents the parton shower that would be produced by any other PDF in the error set.

The weighting method is only theoretically correct in the limit that all possible initial states are populated. For 
this reason, it is important that reasonable statistical samples are generated when using this technique. Any analysis 
which is sensitive to the extreme tails of distributions should use this method with caution. 

There are two options for using the PDF weighting technique. One can either store $2N$ (e.g. 40 for CTEQ) weights
for each event, or store $x_1$, $x_2$, $F_1$, $F_2$, and $Q^2$ and calculate the weights 'offline'. The momentum fractions of the two
incoming partons may be obtained from PYTHIA via PARI(33) and PARI(34). The flavour types of the two initial partons are
stored in $F_1$ = MSTI(15) and $F_2$ = MSTI(16), and the numbering scheme is the same as the one used by LHAPDF
(Table I), except that the gluon is labeled '21' rather than '0'. The $Q^2$ of the interaction is stored in PARI(24). In
theory, this information and access to LHAPDF are all that is needed to use the PDF weights method. This approach
has the additional benefit of enabling 'offline reweighting' with new PDF sets that were not used, or
did not even exist, during the MC generation. We plan to include sample code facilitating the use of PDF weights in
future releases of LHAPDF.
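
The 'offline' option can be sketched as follows. The record layout (a dictionary with keys `x1`, `x2`, `f1`, `f2`, `q2`), the callable `xfx(member, flavour, x, q2)` and the toy density used to exercise it are all hypothetical; in practice `xfx` would wrap a call into a PDF library, whose exact interface is not assumed here. The only conversion taken from the text is the gluon code, 21 in PYTHIA versus 0 in LHAPDF.

```python
def pythia_to_lhapdf_flavour(kf):
    """Map a PYTHIA parton code to the LHAPDF index; only the gluon differs (21 -> 0)."""
    if kf == 21:
        return 0
    if -6 <= kf <= 6 and kf != 0:
        return kf
    raise ValueError(f"not a parton code: {kf}")


def offline_weights(record, xfx, n_members):
    """Weights W^i (i = 0..n_members) for one stored event.

    `xfx(member, flavour, x, q2)` is a hypothetical callable returning x*f(x, Q^2);
    the factors of x cancel in the ratio to the central member (member 0).
    """
    fl1 = pythia_to_lhapdf_flavour(record["f1"])
    fl2 = pythia_to_lhapdf_flavour(record["f2"])

    def product(member):
        return (xfx(member, fl1, record["x1"], record["q2"])
                * xfx(member, fl2, record["x2"], record["q2"]))

    central = product(0)
    return [product(m) / central for m in range(n_members + 1)]


def toy_xfx(member, flavour, x, q2):
    """Toy stand-in for a PDF library call (illustrative only; ignores q2)."""
    power = 5.0 if flavour == 0 else 3.0
    return x ** (0.5 + 0.01 * member) * (1.0 - x) ** power


# One hypothetical stored event: gluon (21) on one side, anti-up (-2) on the other.
record = {"x1": 0.1, "x2": 0.2, "f1": 21, "f2": -2, "q2": 8281.0}
weights = offline_weights(record, toy_xfx, n_members=2)
```

Because only the stored kinematics and flavours enter, the same records can be reweighted later with a different `xfx`, which is what makes reweighting to PDF sets released after the MC production possible.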

TABLE I: The flavour enumeration scheme used for f(n) in LHAPDF 

IV. EXAMPLE STUDIES
A. Drell-Yan at the LHC 



The Drell-Yan process is chosen as an almost ideal test case involving quark PDFs for the different flavours. 



FIG. 1: The parton kinematics for Drell-Yan production at the LHC for three Drell-Yan mass choices. The initial parton
flavour content for the three cases is also shown.

The initial state parton kinematics and flavour contributions are given in Figure 1 for three regions of invariant
mass of the final state lepton pair: $70 < M < 120$, $M > 1000$, and $M > 2000$ GeV. As can be seen, they cover a very
wide range in $x$ and $Q^2$. It is interesting to note that the flavour composition around the Z peak contains important
contributions from five flavours, while at high mass the u and d quarks (in ratio 4:1) dominate almost completely.



B. Higgs Production in $gg \to H$ at the LHC



This channel is chosen as complementary to the first one and contains only contributions from the gluon PDF. A 
light Higgs mass of 120 GeV is selected. 



C. Inclusive Jet Cross Section at the Tevatron and the LHC 



As mentioned above, the inclusive jet cross section was one of the first measurements where the need to quantify
PDF uncertainty was evident. QCD $2 \to 2$ processes are studied for $p_T > 500$ GeV. The kinematic range probed can
be seen in Figure 2.







FIG. 2: The partonic jet kinematics for the inclusive jet cross section at the Tevatron and the LHC. 



D. Results 



The results for all three studies are summarized in Table II. The weighting technique produces the same results as the
more elaborate 'brute force' approach in all cases.



V. SUMMARY 



In this talk new developments of LHAPDF and consistency checks for all PDF sets are described. The approaches
to PDF uncertainty analysis are outlined, and the modern method of PDF weighting is described in detail and tested
in different channels of current interest. Drell-Yan, gluon fusion to Higgs, and high-$p_T$ jet production are studied
at the Tevatron and LHC energy scales. The methods are in agreement in all cases. Equations for quantifying PDF
uncertainty are discussed, and the type which relies on maximal positive and negative variations of the observable
is considered superior.

TABLE II: Results for the three case studies. The central values of the cross sections in [pb] are shown, followed by the estimates of
the uncertainties for the different master equations and the 'brute force' (B.F.) and weighting (W) techniques.